GSTaxClassifier: a genomic signature based taxonomic classifier for metagenomic data analysis

Authors

  • Fahong Yu
  • Yijun Sun
  • Li Liu
  • William Farmerie
Abstract

GSTaxClassifier (Genomic Signature based Taxonomic Classifier) is a program for metagenomic analysis of shotgun DNA sequences. The program includes: a simple but effective algorithm, a modification of the Bayesian method, that predicts the most probable genomic origins of sequences at different taxonomic ranks on the basis of genome databases; a function that generates genomic profiles of reference sequences from tri-, tetra-, penta-, and hexa-nucleotide motifs for setting up a user-defined database; two display formats (tabular and tree-based summaries) for taxonomic predictions with improved analytical methods; and effective ways to retrieve, search, and summarize results by integrating the predictions into the NCBI tree-based taxonomic information. GSTaxClassifier takes nucleotide sequences as input and uses a modified Bayesian model to evaluate the genomic signatures of metagenomic query sequences against reference genome databases. Simulation studies on numerical data sets showed that GSTaxClassifier could serve as a useful program for metagenomics studies. It is freely available at http://helix2.biotech.ufl.edu:26878/metagenomics/.
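The general idea of scoring a query against k-mer profiles of reference genomes can be sketched as follows. This is a minimal illustration using a plain naive Bayes model with add-one smoothing and equal priors, not GSTaxClassifier's modified Bayesian method, whose details the abstract does not give. The function names (`kmer_counts`, `build_profile`, `classify`) and the toy reference sequences are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count overlapping k-mers, skipping any that contain non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def build_profile(reference_seq, k):
    """Log-probability of each of the 4**k motifs in a reference sequence.

    Add-one smoothing (an assumption of this sketch) keeps unseen motifs
    from producing log(0).
    """
    counts = kmer_counts(reference_seq, k)
    total = sum(counts.values()) + 4 ** k
    profile = {}
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        profile[kmer] = math.log((counts[kmer] + 1) / total)
    return profile


def classify(query, profiles, k):
    """Return (best_label, scores) under equal priors.

    Each score is the log-likelihood of the query's k-mers given a profile.
    """
    query_counts = kmer_counts(query, k)
    scores = {
        label: sum(n * profile[kmer] for kmer, n in query_counts.items())
        for label, profile in profiles.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Toy references (hypothetical data, for illustration only).
references = {
    "AT_rich_toy": "ATATATTTAAATATATTAAT" * 20,
    "GC_rich_toy": "GCGCGGCCGCGCCGGCGCGG" * 20,
}
K = 3  # tri-nucleotide motifs; the abstract also mentions k = 4, 5, 6
profiles = {name: build_profile(seq, K) for name, seq in references.items()}
label, scores = classify("ATTATATAAATTTATA", profiles, K)
```

With real data, each profile would be built from a reference genome, and the per-genome scores could then be aggregated along a taxonomy to give predictions at higher ranks.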

Similar resources

Classification of Metagenomics Data at Lower Taxonomic Level Using a Robust Supervised Classifier

As more and more completely sequenced genomes become available, the taxonomic classification of metagenomic data will benefit greatly from supervised classifiers that can be updated instantaneously in response to new genomes. Currently, some supervised classifiers have been developed to assess the organism of metagenomic sequences. We have found that the existing supervised classifiers usually ...

Evidence-Based Clustering of Reads and Taxonomic Analysis of Metagenomic Data

The rapidly emerging field of metagenomics seeks to examine the genomic content of communities of organisms to understand their roles and interactions in an ecosystem. In this paper we focus on clustering methods and their application to taxonomic analysis of metagenomic data. Clustering analysis for metagenomics amounts to grouping similar partial sequences, such as raw sequence reads, into clust...

MyTaxa: an advanced taxonomic classifier for genomic and metagenomic sequences

Determining the taxonomic affiliation of sequences assembled from metagenomes remains a major bottleneck that affects research across the fields of environmental, clinical and evolutionary microbiology. Here, we introduce MyTaxa, a homology-based bioinformatics framework to classify metagenomic and genomic sequences with unprecedented accuracy. The distinguishing aspect of MyTaxa is that it emp...

IMP: a pipeline for reproducible integrated metagenomic and metatranscriptomic analyses

We present IMP, an automated pipeline for reproducible integrated analyses of coupled metagenomic and metatranscriptomic data. IMP incorporates preprocessing, iterative co-assembly of metagenomic and metatranscriptomic data, analyses of microbial community structure and function as well as genomic signature-based visualizations. Complementary use of metagenomic and metatranscripto...

Alignment-free Visualization of Metagenomic Data by Nonlinear Dimension Reduction

The visualization of metagenomic data, especially without prior taxonomic identification of reconstructed genomic fragments, is a challenging problem in computational biology. An ideal visualization method should, among others, enable clear distinction of congruent groups of sequences of closely related taxa, be applicable to fragments of lengths typically achievable following assembly, and all...

Journal title:

Volume 4, Issue

Pages -

Publication date: 2009